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Abstract 



We introduce a fast algorithm for computing sparse Fourier transforms supported 
on smooth curves or surfaces. This problem appear naturally in several important 
problems in wave scattering and reflection seismology. The main observation is that 
the interaction between a frequency region and a spatial region is approximately low 
rank if the product of their radii are bounded by the maximum frequency. Based on 
this property, equivalent sources located at Cartesian grids are used to speed up the 
computation of the interaction between these two regions. The overall structure of 
our algorithm follows the recently-introduced butterfly algorithm. The computation is 
further accelerated by exploiting the tensor-product property of the Fourier kernel in 
two and three dimensions. The proposed algorithm is accurate and has an 0(7VlogiV) 
complexity. Finally, we present numerical results in both two and three dimensions. 

Keywords. Fourier transform; Butterfly algorithm; Multiscale methods; Far field pat- 
tern. 
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1 Introduction 

We consider the rapid computation of the following sparse Fourier transform problem. Let 
be a large integer and X and K be two smooth (or piecewise smooth) curves in the 
unit box [0,1]^. Suppose {xi,i G /} and {kj^j G J} are, respectively, the samples of NX 
and NK, where NX := {N ■ p,p £ X} and NK is defined similarly. Given the sources 
{fjjj G J} at {kj,j G J}, the problem is to compute the potentials {ui, i £ 1} defined by 



where i = \/— T. In most cases, {xi,i G /} and {kj,j G J} sample NX and NK with a 
constant number of samples per unit length. As a result, {xi,i G 1} and {kj,j G J} are of 
size 0{N). A similar problem can be defined in three dimensional space where X and K 
are smooth surfaces in [0, 1]^. However, our discussion here focuses on the two dimensional 
case, as the algorithm can be copied verbatim to the three dimensional case. 

Direct evaluation of ([T]) clearly requires O(A^) steps, which can be quite expensive 
for large values of N. In this paper, we propose an 0{N log N) approach based on the 
butterfiy algorithm [12l[13]. Our algorithm starts by generating two quadtrees Tx and Tk 
for {xi} and {kj}, respectively, where each of their leaf boxes is of unit size. The main 
observation is the following low rank property. Let A and B be are two boxes from Tx 




(1) 



and Tk, respectively. If the widths w"^ and of these two boxes satisfy the condition 
■ < N, then the interaction g^'^*^''^/^ between x G A and k G B is approximately 
of low rank. This property implies that one can reproduce the potential in A with a set of 
equivalent charges whose degree of freedom is bounded by a constant independent of A^. 

The algorithm first constructs the equivalent charges for the leaf boxes of Tk- Next, we 
traverse up Tk to construct the equivalent charges of the non-leaf boxes from the ones of 
their children. For each non-leaf box B, we construct a set of equivalent charges for each 
box A in Tx that satisfies = N/w^ . At the end of this step, we hold the equivalent 
charges of the root box of Tk for each leaf box of Tx ■ Finally, we visit all of the leaf boxes 
of Tx and utilize the equivalent charges of the root box of Tk to compute the potentials 
{ui,i £ I}. 

The sparse Fourier transforms in ([T]) appears naturally in several contexts. One example 
is from the calculation of the time-harmonic scattering field [S] . Suppose that D is a smooth 
surface and is the wave number. The scattering field u satisfies the Helmholtz equation 
—N'^u — Au = in M*^ \ D. Its far field pattern for x on the unit sphere, which is 

highly important for many scattering problems, is defined by 

uoo(x) = / e-'''^-yfiy)ds{y), (2) 

JdD 

where / is some function supported on the boundary dD. After we rescale x and y by 
a factor of A^, ^ takes the form of ([T|. Another example of ([T| appears in the depth 
stepping algorithm in reflection seismology [10]. 

The rest of this paper is organized as follows. In Section [2| we prove the main analytic 
result and describe our algorithm in detail. After that, we provide numerical results for 
both the two and three dimensional cases in Section |3| Finally, we conclude in Section [4] 
with some discussions on future work. 



2 Algorithm Description 



The main theoretical component of our algorithm is the following theorem. Following the 
discussion in Section [T] we use Tx and Tk to denote the quadtrees generated from X and 
K, respectively. 

Theorem 2.1. Let A be a box in Tx and B be a box in Tk- Suppose the width w"^ and 
of A and B satisfy w^w^ = N . Then, for any e > 0, there exists a constant T{e) and 
functions {at{x), I <t < T{e)} and {f3t{k), I <t < T(e)} such that 



J2mx-k/N 



Tie) 

at{x)Pt{k) 

t=i 



< e 



for any x £ A and k £ B. 



The proof of Theorem 2.1 is based on the following elementary lemma. 
Lemma 2.2. For any Z > and e > 0, let S = [max(4e7rZ, log2(l/e))] • Then 

s-i 



(27rzx)* 



t\ 



< £ 



for any x with \x\ < Z. 
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Proof of Theorem 2.1. Let us use = (c^,c^) and = (cf ,cf ) to denote the lower left 
corners of boxes A and B, respectively. Writing x = c"^ + x' and k = + k', we have 

2mx-k/N _ 2m{c^+x'y{c^ +k')/N _ 2-Ki{c^+x'yc^ /N _ 2-nix'-k' /N _ 2mc'^-k' /N 



Notice that the first and the third terms depends only on x' and k\ respectively. Therefore, 
we only need to construct a factorization for the second term. Since \x'\ < y/2w^ and 
l^'l < V^w^ , \x' ■ k'\/N < 2. Invoking the lemma for Z = 2, we obtain the following 
approximation with S'(e) terms: 



^2inx'-k'/N 



S{e)-1 

E 

t=o 



{2mx' • k'/NY 



tl 



< £. 



After expanding each term of the sum using x' -k' = {x^k'^^ +x'2k'2)., we have an approximate 
expansion 

T{e) 



^2mx-k/N 



at{x)f5t{k) 



t=i 



< £ 



where T(e) only depends on the accuracy e. 



□ 



2.1 Equivalent sources 

Given two boxes A and B with w^w^ = N, we denote u^^{x) the potential field in A 
generated by the charges inside B: 

j-.kj&B 

The theorem implies that the field can be approximately reproduced by a group of 
carefully selected equivalent charges inside B. For the efficiency reason to be discussed 
shortly, we pick these charges to be located on a Cartesian grid in B, 



k^ 



Im 



cf + l -,cf +m 

p — I p — 1 



/, m = 0, 1, • • • p — 1 



where p is a constant whose value depends on the prescribed accuracy e. The corresponding 
equivalent charges are denoted by {fii^}. To construct {f{^}, we first select a group of 
points 



A A 
A , W A W 

p — 1 p — I 



, l,m = 0,l,---p-l} . 



located on a Cartesian grid in A. {fi^} are computed by ensuring that they reproduce 



the they generate the field {u{^ := ^^^{^{lyj} at the points {x^}; i.e., 

/ J ^ J I'm' "'Im 



Writing this into a matrix form Mf = u and using the definitions of {x^} and {A;^^^/}, we 
can decompose the p^ x p^ matrix M into a Kronecker product M = Mi M2, where 



Ml 



2tti 



Mo 



2m 
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Since (Mi ® M2) ^ = ^ M2 ^, in order to compute f = M we only need to invert 
the p X p matrices Mi and M2. Expanding the formula for Mi, we get 



(M: 



1 w 



2m , .s 



(3) 



Noticing that the first and the third terms depend only on / and respectively, and 
wawb/N = 1, we can rewrite Mi into a factorization Mi = Mn • G ■ M12, where Mn 

and M12 are two diagonal matrices and the center matrix G given by {G)ii' = e (p-^ 
is independent of N and the boxes A and B. The situation for M2 is exactly the same. 
The result of this discussion is that we reduce the complexity of / = M~^u from O(p^) to 
0{p^) using the Kronecker product structure of M. In fact, one can further reduce it to 
OijP'logp) since the matrix G is a fractional Fourier transform (see P]). 



B 
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Figure 1: The construction of {//^} using {f^^"}- Be is one of i?'s child boxes and A' is 



A's parent box. We first evaluate the potentials {u-j^} located at {x^} using f^^^ 
then find {fi^} so that they produce the same potentials. 



and 



The procedure we just described fails to be efficient when we compute {//^} for a large 
box B, as typically B contains a large number of points {kj} and this makes the evaluation 
of {w^} quite expensive. The second ingredient of the butterfly algorithm addresses this 
problem. In our setting, suppose that i? is a non-leaf box of Tk, j4 is a box in Tx, and 
w^w^ = N. We denote the children of B by Be, c = 1, • • • ,4 and the parent of A by A'. 
Suppose that one constructs the equivalent charges in a bottom- up traversal of Tk- Hence, 
when one reaches B, the equivalent charges of Be have already been computed. The idea 
is then to use the equivalent charges {f^^^"} of Be to compute the check potentials {u-^} 
since A C A'; i.e., 

4 / \ 



EE' 

c=l \l'm' 



2mx 



Im I'm 



J I'm' 



for any I, m. The inner sum X^^/^ 



2mx 



g ;m' I'm'i" f^^'' for each fixed i can be rewritten into 
a matrix form Ef. Using again the Kronecker product, we can decompose E as E\® E2 
where 



,/N fA'B, 
h'm' 



E, 



+1' 



/N 



Eo 



2m 



+m' — 
V- 



/N 



Expanding the formula for Ei, we get 



2m(cf+l^)c?'=/N 27rr^-^i5^%^ 2mc^(l'^)/N 



(4) 
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Noticing that the first and the third terms depend only on I and I' respectively and 

w^w^'/N = 1/2, we can write Ei into a factorization Ei = En ■ H ■ E12 where En and E12 

a' 

are again diagonal matrices and the matrix H given by {H)iii = e ~(p-^ is independent of 
N. 

2.2 Algorithm 

We now give the overall structure of our algorithm. It contains the following steps: 

1. Construct the quadtrees Tx and Tk for the point sets {xi,i E 1} and {kj^j E J}, 
respectively. These trees are constructed adaptively and all the leaf boxes are of unit 



2. Construct the equivalent charges for the leaf boxes in Tk- Suppose that ^0 is the 
root box of Tx- For each leaf box B in Tk, we calculate {/j^^} by matching the 
potentials {uf^^}- 

3. Travel up in Tk and construct the equivalent charges for the non-leaf boxes in Tk- 
For each non-leaf box B in Tk and each box A in Tx with width w"^ = N/w^ , we 
construct {f^} from the equivalent charges of {fi^^"} where A' is ^'s parent and 
{Be, c = 1, • • • ,4} are the B^s children. 

4. Compute {uj,j E J}. Let Bq be the root box of Tk- For each leaf box A in Tx and 
each j such that xj E A, we approximate Uj with 



Let us first estimate the number of operations of the proposed algorithm. The first 
step clearly takes only 0(A^log A^) operations since there are at most 0{N) points in both 
{xi,i E /} and {kj,j E J}. By assumption, both X and K are smooth (or piecewise 
smooth) curves in [0,1]^. Therefore, for a given size w, there are at most 0{N/w) non- 
empty boxes in both Tx and Tk- In particular, we have at most 0{N) leaf boxes in Tx 
and Tk- This implies that the second and fourth steps take at most 0{N) operations. To 
analyze the third step, we estimate level by level. For a fixed level t, there are at most 
2* non-empty boxes in Tk on that level, each of size For each box B on level t, we 

need to construct {fi^} for all the boxes A in Tx with size N/{N/2^) = 2*. It is clear 
that there are at most non-empty boxes in Tx of that size. Since the construction for 
each set of equivalent charges take only constant operations, the total complexity for level 
t is 0(2* X = 0{N). As there are at most O(logiV) levels, the third step takes at 

most 0(A'"logA^) operations. Summing over all the steps, we conclude that our algorithm 
is 0{N log N). 

Our algorithm is also efficient in terms of storage space. Since we give explicit con- 
struct formulas Q and Q for constructing the equivalent charges, we only need to store 
the equivalent charges during the computation. This is where our algorithm differs from 
the one in p3] where they need to require 0{N log N) small matrices for the interpolative 
decomposition [71 [11]. As we mentioned early, at each level, we need to keep 0{N) equiv- 
alent charges, each of which takes 0(1) storage space. Noticing that, at any point of the 
algorithm, we only need to keep the equivalent charges for two adjacent levels, therefore 
the storage requirement of our algorithm is only 0{N). 



size. 




Im 
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The three dimensional case is similar. Since X and K are smooth surfaces in [0, 1]^, the 
number of points in {xi,i G /} and {kj,j £ J} are 0{N^) instead. The Kronecker product 
decomposition is still valid and, therefore, we can construct the equivalent charges efficiently 
in 0{p'^) (or even 0{p'^logp)) operations instead of 0{p^). The algorithm remains exactly 
the same and a similar complexity analysis gives an operation count of 0{N^ log N), which 
is almost linear in terms of the number of points. 

3 Numerical Results 

In this section, we provide some numerical examples to illustrate the properties of our 
algorithm. All of the numerical results are obtained on a desktop computer with a 2.8GHz 
CPU. The accuracy of the algorithm depends on p, which is the size of the Cartesian grid 
used for the equivalent charges. In the following examples, we pick p = 5,7, or 9. The 
larger the value of p, the better the accuracy. 

3.1 2D case 

For each example, we sample the curves NX and NK with 5 points per unit length. 
{fjjj ^ J} ^I's randomly generated numbers with mean 0. Suppose we use {uf,i E /} to 
denote the results of our algorithm. To study the accuracy of our algorithm, we pick a set 
5* C / of size 200 and estimate the error by 

where {ui,i £ S} are the exact potentials computed by direct evaluation. 

Before reporting the numerical results, let us summarize the notations that are used 
in the tables. is the size of the domain, p is the size of the Cartesian grid used for the 
equivalent charges, P is the maximum of the numbers of points in {xj} and {kj}, Ta is the 
running time of our algorithm in seconds, is the estimated running time of the direct 
evaluation in seconds, T^/Ta is the speedup factor, and finally £a is the approximation error. 

Tables [T] and [2] report the results for two testing examples. From these tables, it is quite 
clear that the complexity of our algorithm grows indeed almost linearly in terms of the 
number of points, and its accuracy is stably controlled by the value of p. For larger values 
of N, we obtain a substantial speedup over the direct evaluation. 

3.2 3D case 

We apply our algorithm to the problem of computing the far field pattern In this setup, 
X is always a sphere, while K is the boundary of the scatter. We sample the surface NX 
and NK again with about 5 x 5 = 25 points per unit area. Tables |3] and [4] summarize the 
results of two typical examples in evaluating the far field pattern of scattering fields. 

4 Conclusions and Discussions 

In this paper, we introduced an efficient algorithm for computing sparse Fourier transforms 
located on curves and surfaces. Our algorithm, which is an extension of the butterfly algo- 
rithm, is accurate and has provably 0{N log N) complexity. The success of the algorithm 
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is based on an low rank property concerning the interaction between spatial and frequency 
regions that follow a certain geometrical condition. We use equivalent sources supported on 
Cartesian grids as the low rank representation and exploit the tensor-product property of 
the Fourier transform to achieve maximum efficiency. Furthermore, our algorithm requires 
only linear storage space. 

The problem considered in this paper is only one example of many computational issues 
regarding highly oscillatory behaviors. Some other examples include the computation of 
Fourier integer operators [T^, scattering fields for high frequency waves [8j, and Kirchhoff 
migrations [3]. In the past two decades, many algorithms have been developed to address 
these challenging computational tasks efficiently. Some examples include jH El [H |5l |9] . It 
would be interesting to see whether the ideas behind these approaches can be used to study 
the problem addressed in this paper, and vice versa. 

Acknowledgments. The research presented in this paper was supported by an Alfred 
P. Sloan Fellowship and a startup grant from the University of Texas at Austin. 
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X 

1 1 ' ' ' ^ 

0.9- 
0.8- 

0.7 ^^^^ 

0.6 -X^ 

0.5 

0.4 -\ 

0.3 ^"^^^^^^ ^^^^ 

0.2 

0.1 - 

"o 0.2 0.4 0.6 0.8 



{N,p) 


P 


Ta (sec) 


Td(sec) 


Td/Ta 




(1024,5) 


1.64e+4 


1.23e+0 


3.52e+l 


2.86e+l 


1.66e-3 


(2048,5) 


3.28e+4 


2.71e+0 


1.46e+2 


5.38e+l 


1.76e-3 


(4096,5) 


6.55e+4 


6.05e+0 


5.77e+2 


9.53e+l 


1.94e-3 


(8192,5) 


1.31e+5 


1.32e+l 


2.35e+3 


1.78e+2 


1.97e-3 


(16384,5) 


2.62e+5 


2.89e+l 


9.40e+3 


3.25e+2 


2.00e-3 


(32768,5) 


5.24e+5 


6.27e+l 


3.76e+4 


5.99e+2 


2.11e-3 


(1024,7) 


1.64e+4 


2.07e+0 


3.60e+l 


1.74e+l 


9.31e-6 


(2048,7) 


3.28e+4 


4.64e+0 


1.44e+2 


3.11e+l 


9.20e-6 


(4096,7) 


6.55e+4 


1.03e+l 


5.83e+2 


5.68e+l 


1.03e-5 


(8192,7) 


1.31e+5 


2.27e+l 


2.35e+3 


1.04e+2 


1.04e-5 


(16384,7) 


2.62e+5 


5.14e+l 


9.40e+3 


1.83e+2 


1.07e-5 


(32768,7) 


5.24e+5 


1.18e+2 


3.76e+4 


3.18e+2 


1.18e-5 


(1024,9) 


1.64e+4 


3.31e+0 


3.60e+l 


1.09e+l 


4.77e-8 


(2048,9) 


3.28e+4 


7.23e+0 


1.44e+2 


1.99e+l 


5.85e-8 


(4096,9) 


6.55e+4 


1.59e+l 


5.80e+2 


3.65e+l 


5.05e-8 


(8192,9) 


1.31e+5 


3.57e+l 


2.35e+3 


6.59e+l 


5.75e-8 


(16384,9) 


2.62e+5 


7.74e+l 


9.40e+3 


1.21e+2 


6.16e-8 


(32768,9) 


5.24e+5 


1.87e+2 


3.76e+4 


2.01e+2 


5.94e-8 




Table 1: 2D results. Top: Both X and K are ellipses in unit box [0, 1]^. Bottom: Running 
time, speedup factor and accuracy for different combinations of and p. N is the size of 
the domain, p is the size of the Cartesian grid used for the equivalent charges, P is the 
maximum of the numbers of points in {xj} and {A:j}, Ta is the running time of our algorithm 
in seconds, T^, is the estimated running time of the direct evaluation in seconds, T^/Ta is 
the speedup factor, and finally is the approximation error. 
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X 

1 1 ^ ^ — ^ 

0.9- / \ 

0.8- / \ 

0.7- J 
0.6 - /''^"^ 
0.5 

0.4 X^^^^ 

0.3- ~A r 

0.2- 1 / 

0.1 - \ I 

"o 0.2 0.4 0.6 0.8 



(iV,p) 


P 


ra(sec) 


Trf(sec) 




ea 


(1024,5) 


1.64e+4 


1.93e+0 


3.60e+l 


1.87e+l 


1.52e-3 


(2048,5) 


3.28e+4 


4.25e+0 


1.46e+2 


3.43e+l 


1.67e-3 


(4096,5) 


6.55e+4 


9.49e+0 


5.80e+2 


6.11e+l 


1.66e-3 


(8192,5) 


1.31e+5 


2.06e+l 


2.34e+3 


1.14e+2 


1.69e-3 


(16384,5) 


2.62e+5 


4.59e+l 


9.50e+3 


2.07e+2 


1.99e-3 


(32768,5) 


5.24e+5 


9.85e+l 


3.78e+4 


3.84e+2 


1.84e-3 


(1024,7) 


1.64e+4 


3.20e+0 


3.60e+l 


1.13e+l 


7.81e-6 


(2048,7) 


3.28e+4 


7.12e+0 


1.44e+2 


2.02e+l 


8.26e-6 


(4096,7) 


6.55e+4 


1.60e+l 


5.77e+2 


3.60e+l 


8.79e-6 


(8192,7) 


1.31e+5 


3.54e+l 


2.34e+3 


6.61e+l 


9.25e-6 


(16384,7) 


2.62e+5 


8.14e+l 


9.52e+3 


1.17e+2 


9.07e-6 


(32768,7) 


5.24e+5 


1.91e+2 


3.76e+4 


1.97e+2 


1.07e-5 


(1024,9) 


1.64e+4 


5.02e+0 


3.52e+l 


7.02e+0 


4.24e-8 


(2048,9) 


3.28e+4 


1.12e+l 


1.43e+2 


1.28e+l 


4.77e-8 


(4096,9) 


6.55e+4 


2.47e+l 


5.87e+2 


2.37e+l 


4.65e-8 


(8192,9) 


1.31e+5 


5.60e+l 


2.33e+3 


4.17e+l 


4.35e-8 


(16384,9) 


2.62e+5 


1.24e+2 


9.40e+3 


7.60e+l 


4.99e-8 


(32768,9) 


5.24e+5 


2.84e+2 


3.76e+4 


1.32e+2 


6.04e-8 



Table 2: 2D results. Top: X and K are two smooth curves in unit box [0,1]^. Bottom: 
Running time, speedup factor and accuracy for different combinations of and p. 
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{N,p) 


P 


Ta(sec) 


7d(sec) 


Td/Ta 




(16,5) 


2.14e+4 


2.49e+0 


1.50e+l 


6.03e+0 


1.24e-3 


(32,5) 


8.19e+4 


9.78e+0 


1.88e+2 


1.93e+l 


1.52e-3 


(64,5) 


3.22e+5 


3.94e+l 


2.77e+3 


7.03e+l 


1.44e-3 


(128,5) 


1.28e+6 


1.62e+2 


4.39e+4 


2.71e+2 


1.68e-3 


(256,5) 


5.13e+6 


6.77e+2 


7.00e+5 


1.03e+3 


1.79e-3 


(16,7) 


2.14e+4 


6.77e+0 


1.50e+l 


2.22e+0 


5.80e-5 


(32,7) 


8.19e+4 


2.66e+l 


1.80e+2 


6.79e+0 


7.24e-5 


(64,7) 


3.22e+5 


1.07e+2 


2.74e+3 


2.55e+l 


7.98e-5 


(128,7) 


1.28e+6 


4.40e+2 


4.39e+4 


9.98e+l 


7.89e-5 


(16,9) 


2.14e+4 


1.43e+l 


1.50e+l 


1.05e+0 


2.40e-7 


(32,9) 


8.19e+4 


5.64e+l 


1.88e+2 


3.34e+0 


3.25e-7 


(64,9) 


3.22e+5 


2.28e+2 


2.74e+3 


1.20e+l 


3.24e-7 


(128,9) 


1.28e+6 


9.38e+2 


4.44e+4 


4.74e+l 


3.33e-7 



Table 3: 3D results. Top: the surface K is the boundary of an F16 airplane model. Bottom: 
Running time, speedup factor and accuracy for different combinations of N and p. 
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{N,p) 


P 


Ta(sec) 


7d(sec) 


Td/Ta 


ea 


(16,5) 


2.14e+4 


3.20e+0 


1.07e+l 


3.35e+0 


1.45e-3 


(32,5) 


8.19e+4 


1.25e+l 


1.39e+2 


1.12e+l 


1.65e-3 


(64,5) 


3.22e+5 


5.11e+l 


2.13e+3 


4.17e+l 


1.79e-3 


(128,5) 


1.28e+6 


2.02e+2 


3.46e+4 


1.71e+2 


2.19e-3 


(256,5) 


5.13e+6 


8.31e+2 


5.54e+5 


6.67e+2 


1.94e-3 


(16,7) 


2.14e+4 


8.72e+0 


1.07e+l 


1.23e+0 


7.06e-5 


(32,7) 


8.19e+4 


3.39e+l 


1.39e+2 


4.11e+0 


7.57e-5 


(64,7) 


3.22e+5 


1.35e+2 


2.13e+3 


1.57e+l 


9.05e-5 


(128,7) 


1.28e+6 


5.48e+2 


3.44e+4 


6.28e+l 


1.04e-4 


(16,9) 


2.14e+4 


1.85e+l 


8.58e+0 


4.64e-l 


2.61e-7 


(32,9) 


8.19e+4 


7.19e+l 


1.39e+2 


1.94e+0 


3.00e-7 


(64,9) 


3.22e+5 


2.87e+2 


2.13e+3 


7.41e+0 


4.38e-7 


(128,9) 


1.28e+6 


1.17e+3 


3.52e+4 


3.01e+l 


3.46e-7 



Table 4: 3D results. Top: the surface K is the boundary of a submarine model. Bottom: 
Running time, speedup factor and accuracy for different combinations of N and p. 
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